A Novel Technique for Optimizing the Hidden Layer Architecture in Artificial Neural Networks
Authors
Abstract
Artificial neural networks have shown their effectiveness in many real-world problems such as signal processing, pattern recognition, and classification. Although they provide highly generalized solutions, several problems in using artificial neural networks remain unanswered. Determining the most appropriate architecture of an artificial neural network is one of the major ones. Generally, the performance of a neural network depends strongly on the size of the network. Increasing the number of layers can improve generalization ability, but this solution may not be computationally efficient. On the other hand, too many hidden neurons may over-fit the training data, which causes poor generalization, while too few neurons under-fit the data, so the network may not learn the data properly. Thus, both too many and too few neurons lead to poor generalization, and determining the most suitable architecture is very important in artificial neural networks.
A large number of studies have therefore been carried out to model the hidden layer architecture by using various techniques. These techniques can be categorized as pruning techniques and constructive techniques. Pruning algorithms start with an oversized network and remove nodes until the optimal architecture is reached [1],[2],[3],[4] and [12]. Constructive algorithms [5],[6],[7],[8] work the other way: they build the appropriate neural network during the training process by adding hidden layers, nodes and connection weights to a minimal architecture. However, most of these methods are confined to networks with a small number of neurons or to single-hidden-layer neural networks. Hence, they have not properly addressed the existing problem of hidden layer architecture.
In this paper, a new pruning algorithm based on backpropagation training [11] is proposed to design the optimal neural network. The optimal solution is obtained in two steps.
First, the number of hidden layers in the most efficient network is determined. Then the network moves toward the optimal solution by removing all unimportant nodes from each layer. The removable nodes are identified through the delta values of the hidden neurons [9],[10]. Delta values were chosen because the delta values of the hidden layers are used to compute the error term of the next training cycle; hence, the delta value is a significant factor in the error term. The delta values are therefore used to identify the less salient neurons and remove them from the hidden layers, so that the error term approaches the desired limit faster than with standard backpropagation training. The approaches of other researchers are discussed in the next section. Section III describes the new algorithm and how the delta values are used to optimize the hidden layer architecture. The experimental method and results are discussed in Section IV. Finally, Section V presents the conclusions.
Abstract: The architecture of an artificial neural network has a great impact on its generalization power. More precisely, changing the number of layers and the number of neurons in each hidden layer can significantly change the generalization ability. The architecture is therefore crucial in artificial neural networks, and determining the hidden layer architecture has become a research challenge. In this paper, a pruning technique is presented to obtain an appropriate architecture based on the backpropagation training algorithm. Pruning is done by using the delta values of the hidden layers. The proposed method has been tested on several benchmark problems in artificial neural networks and machine learning. The experimental results show that the modified algorithm reduces the size of the network without degrading its performance. It also reaches the desired error faster than the backpropagation algorithm.
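The idea of ranking hidden neurons by their delta values can be illustrated with a minimal sketch. The text above does not state the exact saliency criterion, so the choices here are assumptions made only for illustration: a single hidden layer of sigmoid units, one sigmoid output unit, squared error, saliency measured as the mean absolute delta over the training samples, and a fixed number of neurons removed. The function names (`hidden_deltas`, `mean_abs_deltas`, `prune_lowest`) are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_deltas(x, target, w_in, w_out):
    """Standard backpropagation deltas of the hidden neurons for one sample.

    w_in[j] holds the input weights of hidden neuron j,
    w_out[j] is its weight to the single output unit.
    """
    # Forward pass through the hidden layer and the output unit.
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_in]
    y = sigmoid(sum(w * hj for w, hj in zip(w_out, h)))
    # Output delta for squared error with a sigmoid output.
    delta_out = (target - y) * y * (1.0 - y)
    # Hidden delta: local sigmoid derivative times the back-propagated output delta.
    return [hj * (1.0 - hj) * w * delta_out for hj, w in zip(h, w_out)]


def mean_abs_deltas(samples, w_in, w_out):
    """Assumed saliency measure: mean |delta| of each hidden neuron."""
    totals = [0.0] * len(w_in)
    for x, target in samples:
        for j, d in enumerate(hidden_deltas(x, target, w_in, w_out)):
            totals[j] += abs(d)
    return [t / len(samples) for t in totals]


def prune_lowest(w_in, w_out, saliency, n_remove):
    """Drop the n_remove hidden neurons with the smallest saliency."""
    order = sorted(range(len(saliency)), key=lambda j: saliency[j])
    drop = set(order[:n_remove])
    kept = [j for j in range(len(saliency)) if j not in drop]
    return [w_in[j] for j in kept], [w_out[j] for j in kept], kept


# Toy network: hidden neuron 1 has no influence on the output (weight 0.0).
w_in = [[0.5, -0.4], [0.3, 0.8], [-0.6, 0.2]]
w_out = [0.7, 0.0, -0.5]
samples = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0), ([1.0, 1.0], 1.0)]

saliency = mean_abs_deltas(samples, w_in, w_out)
new_w_in, new_w_out, kept = prune_lowest(w_in, w_out, saliency, 1)
print(kept)
```

In this toy case the neuron with zero outgoing weight has a delta of exactly zero on every sample, so it is the one removed. In practice pruning would be interleaved with further training, and the removal rule would follow whatever criterion Section III of the paper defines.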
Similar resources
Evaluation of effects of operating parameters on combustible material recovery in coking coal flotation process using artificial neural networks
In this research work, the effects of flotation parameters on coking coal flotation combustible material recovery (CMR) were studied by the artificial neural networks (ANNs) method. The input parameters of the network were the pulp solid weight content, pH, collector dosage, frother dosage, conditioning time, flotation retention time, feed ash content, and rotor rotation speed. In order to sele...
Prediction of breeding values for the milk production trait in Iranian Holstein cows applying artificial neural networks
Artificial neural networks, learning algorithms and mathematical models that mimic the information processing ability of the human brain, can be used for non-linear and complex data. The aim of this study was to predict the breeding values for the milk production trait in Iranian Holstein cows by applying artificial neural networks. Data on 35167 Iranian Holstein cows recorded between 1998 and 2009 were ...
Estimation of coal swelling index based on chemical properties of coal using artificial neural networks
Free swelling index (FSI) is an important parameter for cokeability and combustion of coals. In this research, the effects of chemical properties of coals on the coal free swelling index were studied by artificial neural network methods. The artificial neural networks (ANNs) method was used for 200 datasets to estimate the free swelling index value. In this investigation, ten input parameters ...
Prediction of the deformation modulus of rock masses using Artificial Neural Networks and Regression methods
Static deformation modulus is recognized as one of the most important parameters governing the behavior of rock masses. Predictive models for the mechanical properties of rock masses have been used in rock engineering because direct measurement of the properties is difficult due to time and cost constraints. In this method the deformation modulus is estimated indirectly from classification syst...
Prediction of the Liquid Vapor Pressure Using the Artificial Neural Network-Group Contribution Method
In this paper, the vapor pressure of pure compounds is estimated using Artificial Neural Networks and a simple Group Contribution Method (ANN–GCM). For model comprehensiveness, materials were chosen from various families. Most of the materials are from 12 families. Vapor pressure data of 100 compounds are used to train, validate and test the ANN-GCM model. Va...
Optimization of Oleuropein Extraction from Olive Leaves using Artificial Neural Network
In this work, the artificial neural networks (ANN) technology was applied to the simulation of oleuropein extraction process. For this technology, a 3-layer network structure is applied, and the operation factors such as amount of flow intensity ratio, temperature, residence time, and pH are used as input variables of the network, whereas the extraction yield is considere...